This is the very first exercise of this course. It’s not too hard; its purpose is to get you used to our exercise format (this HTML file, for instance) and, more importantly, to the tidyverse world.

Just two short notes on conducting exercises in this course in general:

  1. We’d like to ask you to perform all tasks by writing them in your own R code file. This ensures that all of your solutions are reproducible, and that you can re-use solutions from earlier exercises in later ones.
  2. All exercises and their solutions assume that they are located in the ./solutions folder of this repository. This way they can reach, for example, the data folder through the relative path "../data/". Please follow this convention by copying your scripts into the solutions folder and setting the working directory accordingly with setwd().
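
To check that the working directory is set as described, the relative path can be built and tested before any data are read. This is only a sketch: the folder passed to setwd() is a hypothetical placeholder and is therefore left commented out, and titanic_path is a name chosen here for illustration.

```r
# Hypothetical location of the solutions folder; adjust to your own clone.
# setwd("path/to/repository/solutions")

# Build the relative path to the titanic data used in this exercise.
titanic_path <- file.path("..", "data", "titanic", "titanic.csv")

# Show where R is currently working and whether the file can be reached.
print(getwd())
print(file.exists(titanic_path))
```

If file.exists() returns FALSE, the working directory is most likely not the solutions folder.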

Again, for starters, this exercise is really short and just lets you play around with some pipes and tibbles. It’s a mini-exercise!

First things first: To work with the ‘tidyverse’, we have to have access to its packages.

1

Load the tidyverse library.
If the tidyverse library has not been installed yet, you can install it with the command install.packages("tidyverse").
if (!require(tidyverse)) install.packages("tidyverse")
library(tidyverse)

After successfully loading the tidyverse library, we turn to the magic world of pipes. Remember, pipes are a convenient way to disentangle nested R functions and to write cleaner R code. First, have a look at the code in the following block:

mean(sqrt(as.numeric(read.csv2("../data/titanic/titanic.csv", sep = ",")$Fare)))

2

What do you think the command is doing?
It’s always a good approach to read from the innermost command to the outer ones.
  1. The titanic data are imported with read.csv2()
  2. Only the Fare variable is extracted using the $-sign
  3. The variable is converted to the numeric format with as.numeric()
  4. A square root transformation is applied with sqrt()
  5. The mean is calculated with mean()
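
The five steps above can be traced one at a time with intermediate variables. The sketch below does not use the titanic file; it writes a tiny made-up CSV to a temporary file (csv_path, passengers and the three fares are assumptions for illustration). It also shows why step 3 is needed: read.csv2() expects "," as the decimal mark by default, so values written with "." are not read as numbers.

```r
# Made-up stand-in for titanic.csv, written to a temporary file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Name,Fare", "A,4.00", "B,9.00", "C,16.00"), csv_path)

# Step 1: import. stringsAsFactors = FALSE is the default since R 4.0.0;
# it is spelled out so that older versions do not create a factor, on
# which as.numeric() would return level codes instead of the values.
passengers <- read.csv2(csv_path, sep = ",", stringsAsFactors = FALSE)

# Step 2: extract the Fare column. It is text, not numeric.
fare_raw <- passengers$Fare
print(is.numeric(fare_raw))

# Step 3: convert to numeric.
fare_num <- as.numeric(fare_raw)

# Step 4: square root transformation.
fare_sqrt <- sqrt(fare_num)

# Step 5: mean of the transformed values.
result <- mean(fare_sqrt)

# The nested one-liner gives the same value.
nested <- mean(sqrt(as.numeric(passengers$Fare)))
print(c(result, nested))
```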

Nesting commands in such a way is not very clear, is it? You have already learned that pipes provide a straightforward way to deal with this issue.

3

Create a nice pipe from this nested command.
You can call individual columns of a piped object with .$col_name.
read.csv2("../data/titanic/titanic.csv", sep = ",") %>% 
  .$Fare %>% 
  as.numeric() %>% 
  sqrt() %>% 
  mean()
## [1] 10.46045
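
Besides .$col_name, dplyr’s pull() also extracts a single column as a vector. It is shown here as an alternative, not as part of the original solution. The sketch uses a small made-up tibble (fares and its values are assumptions, not the titanic data) and compares both spellings.

```r
library(dplyr)  # provides %>%, tibble() and pull()

# Made-up fares stored as text, mimicking a column that needs as.numeric().
fares <- tibble(Fare = c("4", "9", "16"))

# Column extraction with .$col_name, as in the solution above.
with_dollar <- fares %>%
  .$Fare %>%
  as.numeric() %>%
  sqrt() %>%
  mean()

# The same pipe with pull() instead.
with_pull <- fares %>%
  pull(Fare) %>%
  as.numeric() %>%
  sqrt() %>%
  mean()

print(c(with_dollar, with_pull))
```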

As we’ve learned, the tidyverse is not only about pipes; it’s about tidy data. The default format for working with tidy data in the tidyverse is the tibble. In the previous task, you have already imported the titanic data, but it’s in the standard data.frame format.

4

Load the titanic dataset and convert it immediately to a tibble.
Base R’s read.csv2() is your friend. Also, you may want to do it all at once in one pipe.
titanic_tibble <-
  read.csv2("../data/titanic/titanic.csv", sep = ",") %>% 
  as_tibble()
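
To see what as_tibble() changes, the classes of the object before and after the conversion can be compared. A minimal sketch with a made-up data.frame (passengers_df and its values are assumptions for illustration):

```r
library(tibble)

# A plain base-R data.frame with made-up values.
passengers_df <- data.frame(Name = c("A", "B"), Fare = c(10.5, 20))

# Convert it to a tibble.
passengers_tbl <- as_tibble(passengers_df)

# The tibble carries extra classes but is still a data.frame.
print(class(passengers_df))
print(class(passengers_tbl))
print(is.data.frame(passengers_tbl))
```

Because a tibble is still a data.frame, functions that expect a data.frame generally accept it as well.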

Now, look at the following data.frame. It’s been created with the standard base-R tools. The tidyverse also provides the tribble() function to create small data tables as tibbles from scratch.

##   day amount_coffee words_written
## 1   1             2           245
## 2   2             5           691
## 3   3             1            10
## 4   4             8          2100
## 5   5             4           490
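
The code that produced this data.frame is not shown. One way it could have been built with base R is sketched below; the object name coffee_df is an assumption.

```r
# Column-wise construction with base R's data.frame().
coffee_df <- data.frame(
  day = 1:5,
  amount_coffee = c(2, 5, 1, 8, 4),
  words_written = c(245, 691, 10, 2100, 490)
)

print(coffee_df)
```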

5

Use the tribble() function to directly rebuild the data frame as a tibble.
Remember to define new columns with a preceding ~.
tribble(
  ~day,  ~amount_coffee,   ~words_written, 
  1,                 2,              245,
  2,                 5,              691,
  3,                 1,               10,
  4,                 8,             2100,
  5,                 4,              490
)
## # A tibble: 5 x 3
##     day amount_coffee words_written
##   <dbl>         <dbl>         <dbl>
## 1     1             2           245
## 2     2             5           691
## 3     3             1            10
## 4     4             8          2100
## 5     5             4           490
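
tribble() takes the data row by row, which is easy to read for small tables. For comparison, tibble() takes the same data column by column. The following sketch builds the table both ways and checks that the contents agree; by_row and by_column are names chosen here for illustration.

```r
library(tibble)

# Row-wise definition, as in the solution above.
by_row <- tribble(
  ~day, ~amount_coffee, ~words_written,
  1,    2,              245,
  2,    5,              691,
  3,    1,              10,
  4,    8,              2100,
  5,    4,              490
)

# Column-wise definition of the same data.
by_column <- tibble(
  day = c(1, 2, 3, 4, 5),
  amount_coffee = c(2, 5, 1, 8, 4),
  words_written = c(245, 691, 10, 2100, 490)
)

# Compare the contents after dropping the tibble class.
same_content <- all.equal(as.data.frame(by_row), as.data.frame(by_column))
print(same_content)
```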